53 research outputs found
Coherence retrieval using trace regularization
The mutual intensity and its equivalent phase-space representations quantify
an optical field's state of coherence and are important tools in the study of
light propagation and dynamics, but they can only be estimated indirectly from
measurements through a process called coherence retrieval, otherwise known as
phase-space tomography. As practical considerations often rule out the
availability of a complete set of measurements, coherence retrieval is usually
a challenging high-dimensional ill-posed inverse problem. In this paper, we
propose a trace-regularized optimization model for coherence retrieval and a
provably-convergent adaptive accelerated proximal gradient algorithm for
solving the resulting problem. Applying our model and algorithm to both
simulated and experimental data, we demonstrate an improvement in
reconstruction quality over previous models as well as an increase in
convergence speed compared to existing first-order methods.
Comment: 28 pages, 10 figures, accepted for publication in SIAM Journal on Imaging Sciences
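To make the model concrete, the following is a minimal numpy sketch of proximal gradient iterations for a trace-regularized coherence retrieval problem. The measurement model (intensities $b_i \approx a_i^H J a_i$, linear in the mutual intensity $J$), the quadratic data-fit term, and all names are illustrative assumptions rather than the paper's formulation, and a plain proximal gradient step stands in for the paper's adaptive accelerated algorithm; the key ingredient is the prox of the trace regularizer over the positive semidefinite cone, which shifts eigenvalues down and clips them at zero.

```python
import numpy as np

def prox_trace_psd(Y, tau):
    # Prox of tau*trace(X) + indicator{X is PSD}: for Hermitian Y,
    # shift the eigenvalues down by tau and clip them at zero.
    w, V = np.linalg.eigh(Y)
    w = np.maximum(w - tau, 0.0)
    return (V * w) @ V.conj().T

def coherence_retrieval(A, b, lam, step, n_iter=2000):
    # Proximal gradient for the illustrative model
    #   min_J  sum_i (a_i^H J a_i - b_i)^2 + lam * trace(J)   s.t.  J PSD,
    # where row i of A is the measurement vector a_i.
    n = A.shape[1]
    J = np.eye(n, dtype=complex)                    # feasible PSD start
    for _ in range(n_iter):
        vals = np.einsum('mi,ij,mj->m', A.conj(), J, A).real
        grad = 2 * (A.T * (vals - b)) @ A.conj()    # sum_i 2 r_i a_i a_i^H
        J = prox_trace_psd(J - step * grad, step * lam)
    return J

# Synthetic sanity check: low-rank ground truth, noiseless intensities.
rng = np.random.default_rng(0)
n, m = 8, 60
B = rng.normal(size=(n, 2)) + 1j * rng.normal(size=(n, 2))
J_true = B @ B.conj().T
A = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))
b = np.einsum('mi,ij,mj->m', A.conj(), J_true, A).real
step = 1.0 / (2 * (np.linalg.norm(A, axis=1) ** 4).sum())  # safe Lipschitz bound
J_hat = coherence_retrieval(A, b, lam=1e-3, step=step)
```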
Sparse Coding Based Image Restoration and Recognition: Algorithms and Analysis
Ph.D. (Doctor of Philosophy)
The Global R-linear Convergence of Nesterov's Accelerated Gradient Method with Unknown Strongly Convex Parameter
The Nesterov accelerated gradient (NAG) method is an important
extrapolation-based numerical algorithm that accelerates the convergence of the
gradient descent method in convex optimization. When dealing with an objective
function that is $\mu$-strongly convex, selecting extrapolation coefficients
dependent on $\mu$ enables global R-linear convergence. In cases where $\mu$ is
unknown, a commonly adopted approach is to set the extrapolation coefficient as
in the original NAG method. This choice achieves the optimal
iteration complexity among first-order methods for general convex problems.
However, it remains unknown whether the NAG method with an unknown strongly
convex parameter exhibits global R-linear convergence for strongly convex
problems. In this work, we answer this question positively by establishing the
Q-linear convergence of certain constructed Lyapunov sequences. Furthermore, we
extend our result to the global R-linear convergence of the accelerated
proximal gradient method, which is employed for solving strongly convex
composite optimization problems. Interestingly, these results contradict the
findings for the continuous counterpart of the NAG method in [Su, Boyd, and
Candès, J. Mach. Learn. Res., 2016, 17(153), 1-43], where the convergence
rate of the suggested ordinary differential equation cannot exceed $O(1/t^3)$
for strongly convex functions.
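For reference, the $\mu$-free extrapolation discussed above is the classical $t_k$-sequence shown in this short numpy sketch; the step size $1/L$ and the quadratic test problem are our own illustrative choices, and the code only reproduces the iteration whose global R-linear convergence the paper establishes, not the paper's Lyapunov analysis.

```python
import numpy as np

def nag_unknown_mu(grad_f, x0, L, n_iter=300):
    # NAG with the mu-free coefficients of the original method:
    #   t_{k+1} = (1 + sqrt(1 + 4 t_k^2)) / 2,  beta_k = (t_k - 1) / t_{k+1}.
    # Only the gradient Lipschitz constant L is assumed known.
    x_prev = x = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(n_iter):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x + ((t - 1.0) / t_next) * (x - x_prev)  # extrapolation, no mu
        x_prev, x = x, y - grad_f(y) / L             # gradient step at y
        t = t_next
    return x

# Strongly convex quadratic whose mu (= 1.0 here) the method never sees.
Q = np.diag([1.0, 10.0, 100.0])
c = np.array([1.0, 2.0, 3.0])
x = nag_unknown_mu(lambda z: Q @ z - c, np.zeros(3), L=100.0)
print(np.linalg.norm(x - np.linalg.solve(Q, c)))  # error decays R-linearly
```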
Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation
Convolutional neural networks have been widely deployed in various application
scenarios. To extend their use into accuracy-critical domains, researchers have
investigated deeper or wider network structures to boost accuracy, which brings
an exponential increase in computational and storage costs and lengthens
response time. In this paper, we propose a general training framework named
self distillation, which notably enhances the accuracy of convolutional neural
networks by shrinking the network rather than enlarging it. Unlike traditional
knowledge distillation, a knowledge-transfer method between networks that
forces a student network to approximate the softmax outputs of a pre-trained
teacher network, the proposed self distillation framework distills knowledge
within the network itself: the network is first divided into several sections,
and the knowledge in its deeper portion is then squeezed into the shallower
ones. Experiments further demonstrate the generality of the proposed framework:
accuracy improves by 2.65% on average, ranging from 0.61% for ResNeXt to 4.07%
for VGG19. In addition, self distillation enables depth-wise scalable inference
on resource-limited edge devices. Our code will be released on GitHub soon.
Comment: 10 pages
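As a rough illustration of the training scheme described above, here is a self-contained PyTorch sketch with a toy backbone split into three sections: each shallow section gets an auxiliary classifier that is trained on the labels and simultaneously pulled, via a temperature-softened KL term, toward the deepest classifier's outputs. The architecture, the loss weights, the temperature, and the choice to detach the teacher logits are all our own illustrative assumptions; the paper's framework also includes a feature-level hint loss that is omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfDistilNet(nn.Module):
    # Toy backbone cut into three sections; every section gets its own
    # classifier head, the last one acting as the in-network "teacher".
    def __init__(self, num_classes=10):
        super().__init__()
        self.sections = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
            nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
            nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU()),
        ])
        self.heads = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, num_classes)),
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes)),
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes)),
        ])

    def forward(self, x):
        logits = []
        for section, head in zip(self.sections, self.heads):
            x = section(x)
            logits.append(head(x))
        return logits  # ordered shallow -> deep

def self_distillation_loss(logits, labels, alpha=0.5, T=3.0):
    # Deepest head learns from the labels; each shallow head mixes a label
    # loss with a KL term toward the deepest head's softened predictions.
    deep = logits[-1]
    loss = F.cross_entropy(deep, labels)
    soft_teacher = F.softmax(deep.detach() / T, dim=1)   # detach: common choice
    for shallow in logits[:-1]:
        kl = F.kl_div(F.log_softmax(shallow / T, dim=1), soft_teacher,
                      reduction='batchmean') * T * T
        loss = loss + (1 - alpha) * F.cross_entropy(shallow, labels) + alpha * kl
    return loss

# One illustrative training step on random data.
model = SelfDistilNet()
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
self_distillation_loss(model(x), y).backward()
```

At inference time, the auxiliary heads enable the depth-wise scalable deployment mentioned in the abstract: an edge device can stop after an early section and read off that section's classifier instead of running the full network.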